Attribute Extraction from Conjectural Queries
Abstract
Conjectural search queries (is iodine soluble in water, is python case sensitive, is millennium stadium heated) embody attempts by Web users to verify whether a particular property (soluble in water?, case sensitive?, heated?) does or does not apply to a particular instance (iodine, python, millennium stadium). This paper treats such queries as a data source of attributes of open-domain classes. Conjectural attributes complement attributes encoded in human-compiled knowledge resources or automatically acquired from text by previous methods. They correspond to properties of interest to Web users, which are not necessarily stated in nominal form. Relevant properties of Chemical elements, Programming languages and Stadiums include whether they are soluble in water, flammable or ductile; case sensitive, platform independent or interpreted; or air conditioned, roof retractable or heated, respectively. Experimental results show that relevant conjectural attributes can be extracted from inherently noisy queries, for a variety of open-domain classes of interest.
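As a rough illustration of the idea in the abstract, the sketch below matches queries of the form "is <instance> <property>" against known class instances and counts the resulting property strings per class. It is a minimal sketch under stated assumptions, not the paper's extraction method: the function name `extract_conjectural_attributes`, the toy class inventory, the restriction to the single prefix "is" and the raw frequency counting are all hypothetical choices made for this example.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy inventory of classes and their instances (assumption).
CLASSES = {
    "Chemical elements": {"iodine", "sodium"},
    "Programming languages": {"python", "java"},
    "Stadiums": {"millennium stadium"},
}

# Assumed query shape: "is <instance> <property>".
QUERY_PATTERN = re.compile(r"^is\s+(.+)$")


def extract_conjectural_attributes(queries, classes):
    """Count candidate attributes per class from conjectural queries."""
    instance_to_class = {
        instance: label
        for label, instances in classes.items()
        for instance in instances
    }
    # Try longer instance names first so multi-word instances win.
    instances = sorted(instance_to_class, key=len, reverse=True)
    counts = defaultdict(Counter)
    for query in queries:
        match = QUERY_PATTERN.match(query.strip().lower())
        if not match:
            continue  # not a conjectural query under this assumed pattern
        rest = match.group(1)
        for instance in instances:
            prefix = instance + " "
            if rest.startswith(prefix):
                attribute = rest[len(prefix):].strip()
                if attribute:
                    counts[instance_to_class[instance]][attribute] += 1
                break
    return counts


if __name__ == "__main__":
    sample = [
        "is python case sensitive",
        "is java case sensitive",
        "is iodine soluble in water",
        "is millennium stadium heated",
        "python tutorial",
    ]
    for label, attributes in extract_conjectural_attributes(sample, CLASSES).items():
        print(label, attributes.most_common())
```

Counting an attribute across several instances of the same class is one simple way to dampen noise from individual queries; a real system would need far more careful filtering and ranking.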
Related resources
The Role of Query Sessions in Extracting Instance Attributes from Web Search Queries
Per-instance attributes are acquired using a weakly supervised extraction method which exploits anonymized Web-search query sessions, as an alternative to isolated, individual queries. Examples of these attributes are top speed for chevrolet corvette, or population density for brazil. Inherent challenges associated with using sessions for attribute extraction, such as a large majority of withi...
Low-Cost Supervision for Multiple-Source Attribute Extraction
Previous studies on extracting class attributes from unstructured text consider either Web documents or query logs as the source of textual data. Web search queries have been shown to yield attributes of higher quality. However, since many relevant attributes found in Web documents occur infrequently in query logs, Web documents remain an important source for extraction. In this paper, we intro...
Turning Web Text and Search Queries into Factual Knowledge: Hierarchical Class Attribute Extraction
A seed-based framework for textual information extraction allows for weakly supervised acquisition of open-domain class attributes over conceptual hierarchies, from a combination of Web documents and query logs. Automatically extracted labeled classes, consisting of a label (e.g., painkillers) and an associated set of instances (e.g., vicodin, oxycontin), are linked under existing conceptual hie...
Lightly-Supervised Attribute Extraction
Web search engines can greatly benefit from knowledge about attributes of entities present in search queries. In this paper, we introduce lightly-supervised methods for extracting entity attributes from natural language text. Using these methods, we are able to extract large numbers of attributes of different entities at fairly high precision from a large natural language corpus. We compare our...
Attribute Extraction from Synthetic Web Search Queries
The accuracy and coverage of existing methods for extracting attributes of instances from text in general, and Web search queries in particular, are limited by two main factors: availability of input textual data to which the methods can be applied, and inherent limitations of the underlying assumptions and algorithms being used. This paper proposes a weakly-supervised approach for the acquisit...